(54) A method for computational graceful degradation in an audiovisual compression system

(57) The invention disclosed here is a method for an
encoder to encode audiovisual information for transmission
to the decoder without any prior knowledge of the
computational capabilities of the decoder. A descriptor
containing parameters that can be used to estimate the
complexity of the decoding process is embedded in the
system stream. The encoder also encodes the video
information in such a manner that the decoder can
choose to ignore some of the information and only
decode a subset of the encoded information in order to
reduce the computational requirements. This method
allows more than one decoder to decode the same
bitstream, giving different resolutions depending on the
computational capability of the decoder.
Description

BACKGROUND OF THE INVENTION
1. Field of the Invention



[0001] The present invention relates to a method for
computational graceful degradation in an audiovisual
compression system. This invention is useful in a
multimedia encoding and decoding environment where the
computational demand for decoding a bitstream is not
well defined. It is also useful in cases where channel
capacity is limited and some form of quality of service
guarantee is required. It is also useful for interworking
between two video services of different resolutions.

2. Description of the Related Art

[0002] It is common in the case of software decoding
to employ some form of graceful degradation when the
system resources are not sufficient to fully decode all of
the video bitstream. This degradation ranges from
partial decoding of the picture elements to dropping of
complete pictures. This is easy to implement in the case
of a single video stream.

[0003] In the proposed new ISO/IEC SC29/WG11
standard of MPEG-4, it is possible to send multiple
audiovisual (AV) objects. Therefore, the total complexity
requirements no longer depend on one single stream
but on multiple streams.

[0004] In compression systems such as MPEG-1,
MPEG-2 and MPEG-4, a high degree of temporal
redundancy is removed by employing motion compensation.
It is intuitive to see that successive pictures in a
video sequence will contain very similar information.
Only regions of the picture that are moving will change
from picture to picture. Furthermore, these regions
usually move as a unit with uniform motion. Motion
compensation is a technique where the encoder and the
decoder keep the reconstructed picture as a reference
for the prediction of the current picture being encoded or
decoded. The encoder mirrors the decoder by implementing
a local decoder loop, thus keeping the reconstructed
picture synchronized between the encoder and
decoder.

[0005] The encoder performs a search for a block in
the reconstructed picture that gives the closest match to
the current block that is being encoded. It then computes
the prediction difference between the motion
compensated block and the current block being
encoded. Since the motion compensated block is available
in the encoder and the decoder, the encoder only
needs to send the location of this block and the prediction
difference to the decoder. The location of the block
is commonly referred to as the motion vector. The
prediction difference is commonly referred to as the motion
compensated prediction error. This information
requires fewer bits to send than the current block itself.
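
The block search described above can be illustrated with a minimal sketch. It assumes an exhaustive (full) search and a sum-of-absolute-differences matching cost; neither choice is stated in this document, and the function names are hypothetical.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def get_block(picture, top, left, size):
    """Extract a size x size block whose top-left pixel is (left, top)."""
    return [row[left:left + size] for row in picture[top:top + size]]


def full_search(reference, current_block, top, left, size, search_range):
    """Return (motion_vector, prediction_error) for the closest match.

    The motion vector is (dx, dy) relative to the position (left, top)
    of the current block. Candidates that do not lie fully inside the
    reference picture are skipped (an assumption of this sketch).
    """
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            candidate = get_block(reference, y, x, size)
            cost = sad(current_block, candidate)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy), candidate)
    _, motion_vector, prediction = best
    # Only the motion vector and this difference need to be transmitted.
    error = [[c - p for c, p in zip(rc, rp)]
             for rc, rp in zip(current_block, prediction)]
    return motion_vector, error
```

When the current block is an exact displaced copy of a block in the reference picture, the prediction error is all zeros and only the motion vector carries information.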



[0006] In intra-picture coding, spatial redundancy may
be removed in a similar way. The transform coefficients
of the block can be predicted from the transform
coefficients of its neighboring blocks that have already been
decoded.

[0007] There are two major problems to be solved in
this invention. The first is how to indicate the decoding
complexity requirements of the current AV object. In the
case where there are multiple AV objects, the systems
decoder must decide how much resource should be
given to a particular object and which object should
have priority over another. In other words, how to model
the complexity requirements of the system. A point to be
noted here is that the complexity requirements of the
decoder are dependent on the implementation of the
decoder. An operation that is complex for one
implementation may be simple for another implementation.
Therefore, some form of implementation independent
complexity measure is required.

[0008] The second problem is how to reduce complexity
requirements in the decoder. This deals with the
method of reducing the complexity requirements of the
decoding process while retaining as much of the
information as possible. One of the biggest problems in
graceful degradation is the problem of drift caused by errors in
the motion compensation. When graceful degradation is
employed, the reconstructed picture is incomplete or
noisy. These errors are propagated from picture to
picture, resulting in larger and larger errors. This noise
propagation is referred to as drift.

SUMMARY OF THE INVENTION 

[0009] In order to solve the problems, the following
steps are taken in the present invention.

[0010] The AV object encoder encodes the AV object
in a manner that would allow different amounts of graceful
degradation to be employed in the AV object
decoder. Parameters relating to the computational
complexity requirements of the AV objects are transmitted by
the systems encoder. An implementation independent
complexity measure is achieved by sending parameters
that give an indication of the operations that are
required.

[0011] At the systems decoder, estimates of the
complexity required are made based on these parameters
as well as the implementation methods being employed.
The resource scheduler then allocates the appropriate
amount of resources to the decoding of the different AV
objects. In the AV object decoder, computational graceful
degradation is employed when the resources are not
sufficient to decode the AV object completely.
[0012] In accordance with a first aspect of the present
invention, a method of encoding a plurality of audiovisual
objects into a compressed coded representation
suitable for computational graceful degradation at the
decoder comprises:

encoding said audiovisual objects, incorporating
methods allowing computational graceful degradation
to be employed in the decoder, into their coded
representations;

estimating the implementation independent computational
complexity measures in terms of a plurality
of block decoding parameters;

partitioning said coded representations of the
audiovisual objects into a plurality of access units and
adding header information to form packets;

inserting a descriptor containing said block decoding
parameters into the header of the packet; and

multiplexing these packets to form a single multiplexed
bitstream.


[0013] In accordance with a second aspect of the
present invention, a method of decoding a multiplexed
bitstream, with computational graceful degradation, to
obtain a plurality of audiovisual objects, comprises:

de-multiplexing the single multiplexed bitstream into
a plurality of packets comprising packet headers
and access units;

extracting the descriptor containing a plurality of
block decoding parameters from the packet headers;

reassembling the access units into their original
coded representations of the audiovisual objects;

estimating the decoder specific computational complexity
measures based on said block decoding
parameters and the current decoder implementation; and

decoding said coded representations of the audiovisual
objects, using computational graceful degradation,
where necessary, to satisfy the estimated
decoder specific computational complexity requirements.

[0014] Preferably, the incorporation methods allowing
computational graceful degradation to be employed in
the decoder, comprise:

partitioning the input pictures to be encoded into a
plurality of sub-regions numbered in increasing
order, beginning with the full picture as the first
sub-region, where each sub-region comprises only a
subset of the pixels within the sub-region preceding
it;

entropy coding the position and dimension of the
sub-regions into a compressed coded representation
within the bitstream;

further partitioning the sub-regions into a plurality of
blocks for encoding into a compressed coded
representation within the bitstream;

performing motion estimation and motion compensation
for said blocks using only the pixels from the
reconstructed picture that belong to sub-regions
having the same or higher numeric order as said
blocks;

entropy coding the motion vectors into a compressed
coded representation within the bitstream;

transforming the motion compensated prediction
difference into an orthogonal domain;

quantizing the transformed coefficients using a
quantization method; and,

entropy coding the quantized transformed coefficients
into a compressed coded representation
within the bitstream.

[0015] Preferably, the method for decoding the coded
representations of the audiovisual objects in accordance
with the second aspect, using computational
graceful degradation where necessary to satisfy the
estimated decoder specific computational complexity
requirements, further comprises:

entropy decoding the position and dimension of the
sub-regions from the compressed coded representation
within the bitstream;

selecting only the blocks that are within the
sub-region of interest for decoding;

entropy decoding the compressed coded representation
to give quantized transformed coefficients;

inverse quantizing said quantized transformed
coefficients to give the transformed coefficients;

inverse transforming said transform coefficients to
give the spatial domain motion compensated
prediction difference;

entropy decoding the motion vectors from the
compressed coded representation within the bitstream;

performing motion compensation for said blocks
using only the pixels from the reconstructed picture
that belong to sub-regions having the same or
higher numeric order as said blocks; and,

reconstructing the picture and storing said picture
in the frame memory for prediction of the next
picture.

[0016] Preferably, the method in accordance with the
first aspect of the invention, whereby incorporation
methods allowing computational graceful degradation to
be employed in the decoder, further comprises:

partitioning the input pictures to be encoded into a
plurality of sub-regions numbered in increasing
order, beginning with the full picture as the first
sub-region, where each sub-region comprises only a
subset of the pixels within the sub-region preceding
it;

entropy coding the position and dimension of the
sub-regions into a compressed coded representation
within the bitstream;

further partitioning the sub-regions into a plurality of
blocks for encoding into a compressed coded
representation within the bitstream;

transforming said blocks into an orthogonal
domain;

quantizing the transformed coefficients using a
quantization method;

performing quantized transform coefficient prediction
for said blocks using only the corresponding
quantized transform coefficients from the blocks
above and to the left that belong to sub-regions
having the same or higher numeric order as said
blocks; and,

entropy coding the predicted difference of the
quantized transformed coefficients into a compressed
coded representation within the bitstream.

[0017] Preferably, the method in accordance with the
first aspect of the invention comprises:

entropy decoding the position and dimension of the
sub-regions from the compressed coded representation
within the bitstream;

selecting only the blocks that are within the
sub-region of interest for decoding;

entropy decoding the compressed coded representation
to give quantized transformed coefficients;

performing quantized transform coefficient prediction
for said blocks using only the corresponding
quantized transform coefficients from the blocks
above and to the left that belong to sub-regions
having the same or higher numeric order as said
blocks;

inverse quantizing said quantized transformed
coefficients to give the transformed coefficients;

inverse transforming said transform coefficients to
give the spatial domain pixel values; and,

reconstructing the picture and storing said picture
in the frame memory for prediction of the next
picture.

[0018] Typically, the plurality of block decoding
parameters comprises numbers indicating the
number of:

block entropy decoding operations;

block motion compensation operations;

block inverse quantization operations;

block transform operations;

block addition operations; and,

block memory access operations.

[0019] Preferably, the descriptor comprises:

a descriptor identification number signaling the
descriptor type;

a descriptor length field to indicate the size of the
descriptor; and,

a plurality of block decoding parameters.

[0020] Typically, in the method of partitioning the input
pictures to be encoded into a plurality of sub-regions,
the sub-regions are rectangular.

[0021] Preferably, in the method of performing motion
estimation and motion compensation for said blocks,
the use of only the pixels from the reconstructed picture that
belong to sub-regions having the same or higher
numeric order as said blocks implies that only
prediction blocks that lie completely within said sub-regions
are selected.

[0022] Typically, when only the pixels from the recon- 
structed picture that belong to sub-regions having the 
same or higher numeric order as said blocks are used, 
prediction blocks may lie partially outside said sub- 
regions but with the additional condition that the pixels 
lying outside said sub-region are replaced by the near- 
est pixels from within the sub-regions. 
[0023] Preferably, in the method of partitioning the
pictures into a plurality of sub-regions, the position and
dimension of each of said sub-regions may vary from 
picture to picture and said position and said dimension 
are coded by means of a pan scan vector, giving the 
horizontal and vertical displacement, a width and a 
height. 

[0024] Typically, in the method of partitioning the pic- 
tures into a plurality of sub-regions, the position and 
dimension of the sub-regions are the same from picture
to picture and said position and said dimension are
coded once at the beginning of the sequence by means
of a horizontal and vertical displacement, a width and a
height. 

[0025] Preferably, in the method of encoding and 
decoding, the transform is the Discrete Cosine Trans- 
form. 

[0026] Typically, in the method of encoding and decoding,
the number of sub-regions is two.
[0027] Preferably, in the method where there is a
plurality of sub-regions numbered in increasing order,
the motion vector can point into a sub-region of lower
order but not out of a lower order to a higher ordered
number.

BRIEF DESCRIPTION OF THE DRAWINGS 
[0028] 

Figure 1 is an overall block diagram of the present
invention;

Figure 2 shows a block diagram of encoder and
decoder of the present invention;

Figure 3 illustrates the embodiment of the sub-region
and the motion vector restriction of the
present invention;

Figure 4 illustrates the embodiment for the pan-scan
vectors and the sub-region dimensions in the
present invention;

Figure 5 illustrates the second embodiment for the
padding method of the motion compensated prediction
at the sub-region boundary; and,

Figure 6 illustrates the block diagram for the
Complexity Estimator.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0029] Figure 1 shows an overall system block diagram
of the present invention. Encoder unit 110
encodes the video sequence to allow computational
graceful degradation techniques. The output of encoder
110 is a coded representation of the video sequence
that is applied to an encoding buffer 120. At the same
time the video sequence and the coded representation
are also applied to a complexity parameter encoder 130
where the parameters associated with the operations
that are required for decoding are computed and
encoded. This information, together with the output of
the encoding buffer 120, is passed to a System
Encoder and Multiplexer unit 140 where a system-multiplexed
stream is formed. The system-multiplexed
stream is transmitted through a transmission media
150.

[0030] A Demultiplexer and System Decoder unit 160
receives the system-multiplexed stream, where the
bitstream is demultiplexed into its respective elementary
streams. The video elementary stream is passed to a
Decoding Buffer 170, and complexity parameters are
passed to a Scheduler and Complexity Estimator unit
180. From the Decoding Buffer 170, the video elementary
stream is passed to a Decoder unit 190. The
decoder 190 waits for the commands coming from the
Scheduler unit 180 before decoding.
[0031] The Complexity Estimator 180 gives the
amount of decoder computational graceful degradation
that is to be employed. Computational graceful degradation
is achieved in the decoder by decoding only a
sub-region of the complete picture that is deemed to
contain the more important information. The encoder
will have to prevent the encoder and decoder from drifting
apart under these conditions. After decoding, the
decoder unit 190 also feeds back information to the
Scheduler and Complexity Estimator 180 so that the
information may be used to estimate the complexity of
the next picture.

[0032] The following is the embodiment of the various 
units illustrated in the above invention shown in Figure 
1. 

[0033] Figure 2 is a block diagram of the encoder and
decoder according to the present embodiment. The
input picture to the encoder 110 is segmented into
blocks for processing. Temporal redundancy is removed
from the picture by subtracting the motion compensated
picture of the previous picture from the current picture.
The prediction difference is then transformed into the
DCT domain in a DCT unit 111. The resulting DCT
coefficients are then quantized in a Quantization unit 112.
The quantized coefficients are then entropy coded in a
Variable Length Coding (VLC) unit 113 to form the
compressed output bitstream. The encoder 110 also has a
local decoder loop comprising an Inverse Quantization
unit 114, an Inverse DCT unit 115, a Frame Storage
116, and a Motion Compensation unit 117. The local
decoder loop mimics the decoder operations by
inverse quantizing the coefficients and transforming them
back into the spatial domain in the Inverse Quantization
unit 114 and Inverse DCT unit 115. The output is then
added to the output of the Motion Compensation unit
117 to form the reconstructed picture. This picture is
stored in the Frame Storage 116 for motion compensation
of the next picture.

[0034] In this embodiment the encoder units of Motion
Estimation unit 118 and Motion Compensation unit 117
are changed so that computational graceful degradation
may be performed in conjunction with the motion
compensation without causing drift.

[0035] Figure 3 illustrates the present invention,
according to which the picture is divided into two parts
220 and 210. The first part 220 is a sub-region that must
be decoded in the decoder regardless of whether
computational graceful degradation is employed or not. The
second part 210 is the region outside of the sub-region,
which may be discarded by the decoder when
computational graceful degradation is employed.

[0036] Figure 3 also shows two blocks that are used for
motion compensation. When motion compensation is
performed on a block 250 that resides in the sub-region
220, the motion compensated prediction block must
also come from within the sub-region 220 of the
reference picture. In other words, the motion vector 260
pointing out of the region is not allowed. This is referred
to as a restricted motion vector. On the other hand, when a
block 230 resides outside the sub-region 220, the
motion compensated prediction block can come from
anywhere in the reference picture. This is the same as
where there is no sub-region.
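
The restriction can be expressed as a simple containment test. The sketch below is illustrative only: the `Rect` type and the function name are hypothetical, and it assumes that every block lies either entirely inside or entirely outside the sub-region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in pixel units (hypothetical helper type)."""
    left: int
    top: int
    width: int
    height: int

    def contains(self, other):
        """True when `other` lies completely inside this rectangle."""
        return (self.left <= other.left
                and self.top <= other.top
                and other.left + other.width <= self.left + self.width
                and other.top + other.height <= self.top + self.height)


def motion_vector_allowed(sub_region, block, motion_vector):
    """Apply the restricted motion vector rule to one block.

    A block inside the sub-region may only be predicted from a block
    that also lies completely inside the sub-region. A block outside
    the sub-region is unrestricted.
    """
    if not sub_region.contains(block):
        return True
    dx, dy = motion_vector
    prediction = Rect(block.left + dx, block.top + dy,
                      block.width, block.height)
    return sub_region.contains(prediction)
```

An encoder could apply such a test to each candidate during motion estimation and discard the candidates that fail it.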

[0037] Figure 4 shows how the sub-region 220 is
indicated within each picture. In order to specify
the rectangular sub-region 220 for each picture, the
following parameters must be specified for each picture
and be encoded in the picture header of the compressed
bitstream. In Figure 4, a picture 310 and the sub-region
220 are illustrated. The horizontal offset 330 of the left
edge of sub-region 220 from the left edge of the picture,
and the vertical offset 340 of the top edge of the
sub-region 220 from the top edge of the picture are shown.
These two parameters, referred to as the pan scan
vectors, are used to indicate the location of the sub-region.
The width 350 and the height 360 of the sub-region 220
are the second set of parameters that are required to
specify the dimensions of the sub-region 220.
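
As a rough illustration of how a decoder might use these four parameters, the sketch below lists the blocks that fall inside the sub-region. The function name, the 8-pixel block size, and the assumption that only blocks lying entirely within the sub-region are kept are all hypothetical.

```python
def blocks_in_sub_region(picture_width, picture_height,
                         horizontal_offset, vertical_offset,
                         width, height, block_size=8):
    """Return (column, row) indices of blocks inside the sub-region.

    horizontal_offset and vertical_offset play the role of the pan scan
    vectors; width and height give the sub-region dimensions.
    """
    selected = []
    for row in range(picture_height // block_size):
        for column in range(picture_width // block_size):
            x, y = column * block_size, row * block_size
            if (x >= horizontal_offset
                    and y >= vertical_offset
                    and x + block_size <= horizontal_offset + width
                    and y + block_size <= vertical_offset + height):
                selected.append((column, row))
    return selected
```

A decoder employing computational graceful degradation would process only the blocks returned here and skip the rest of the picture.
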
EP 0 912 063 A2

[0038] In a second embodiment of this invention, the
motion vector for a block in the sub-region need not be
restricted. It is allowed to point out of the sub-region of
the reference picture. However, padding is required. This
is illustrated in Figure 5 in which the picture 310 and the
sub-region 220 are shown. The motion compensated
prediction 430 is shown straddling the boundary of the
sub-region 220. A portion 431 of the block residing
outside of the sub-region 220 is not used for prediction and
is padded by repeating the value of the pixel found at
the edge of the sub-region 220. A portion 432 of the
block residing in the sub-region 220 is used without any
padding. A similar padding method is used for the rows
and columns for blocks located at the vertical edge and
horizontal edge, respectively.
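
One way to realise this padding is to clamp each pixel coordinate of the prediction block to the sub-region before reading the reference picture, which repeats the edge pixel for every position outside the sub-region. The sketch below assumes this clamping formulation; the function name and the tuple layout of `sub_region` are hypothetical.

```python
def padded_prediction(reference, sub_region, top, left, size):
    """Fetch a size x size prediction block with edge-pixel padding.

    sub_region is (horizontal_offset, vertical_offset, width, height).
    Coordinates outside the sub-region are clamped to its nearest edge,
    so pixels outside the sub-region are never read.
    """
    horizontal_offset, vertical_offset, width, height = sub_region
    x_min, x_max = horizontal_offset, horizontal_offset + width - 1
    y_min, y_max = vertical_offset, vertical_offset + height - 1
    block = []
    for y in range(top, top + size):
        clamped_y = min(max(y, y_min), y_max)
        row = []
        for x in range(left, left + size):
            clamped_x = min(max(x, x_min), x_max)
            row.append(reference[clamped_y][clamped_x])
        block.append(row)
    return block
```

Because the result depends only on pixels inside the sub-region, a decoder that discarded everything outside the sub-region can reproduce the same prediction as the encoder.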

[0039] Like the first embodiment, the method according
to the second embodiment would also enable the
computational graceful degradation method to discard the
portion of the picture outside the sub-region 220 without
causing the encoder and decoder to drift apart.
[0040] Apart from motion compensation that may
cause drift in inter blocks, intra blocks at the top and left
boundary of the sub-region 220 are also restricted from
using any blocks outside of the sub-region 220 for
prediction. This is because in a decoder employing
computational graceful degradation, these blocks would not be
decoded and thus the prediction cannot be duplicated. This
precludes the commonly used DC and AC coefficient
prediction from being employed in the encoder.
[0041] Figure 2 also illustrates a block diagram of a 
decoder 190. The embodiment of the decoder 190 
employing computational graceful degradation is 
described here. The compressed bitstream is received 
from the transmission and is passed to a Variable 
Length Decoder unit 191 where the bitstream is 
decoded according to the syntax and entropy method 
used. The decoded information is then passed to the 
Computational Graceful Degradation Selector 192 
where the decoded information belonging to the sub- 
region 220 is retained and the decoded information out- 
side of the sub-region 220 is discarded. The retained 
information is then passed to an Inverse Quantization 
unit 193 where the DCT coefficients are recovered. The 
recovered coefficients are then passed to an Inverse 
DCT unit 194 where the coefficients are transformed 
back to the spatial domain. The motion compensated 
prediction is then added to form the reconstructed pic- 
ture. The reconstructed picture is stored in a Frame 
Storage 195 where it is used for the prediction of the
next picture. A Motion compensation unit 196 performs 
the motion compensation according to the same 
method employed in the encoder 110. 
[0042] In the first embodiment of the encoder, where
the motion vector is restricted, no additional modification
is required in the decoder. In the second embodiment of
the encoder, where the motion vector is not restricted,
the motion compensation method with padding
described above in connection with Fig. 5 is used in the
decoder. Finally, intra blocks at the top and left boundary
of the sub-region 220 are also restricted from using
any blocks outside of the sub-region 220 for prediction.
This precludes the commonly used DC and AC coefficient
prediction from being employed.
[0043] In this embodiment the Complexity Parameter
Encoder consists of a counting unit that counts the
number of block decoding operations that are required.
The block decoding operations are not basic arithmetic
operations but rather a collection of operations that are
performed on a block. A block decoding operation can
be a block inverse quantization operation, a block
inverse DCT operation, a block memory access or some
other collection of operations that perform some decoding
task on a block-by-block basis. The Complexity
Parameter Encoder counts the number of blocks that
require each set of operations and indicates these in the
parameters. The reason block decoding operations are
used instead of simple arithmetic operations is because
different implementations may implement different
operations more efficiently than others.
[0044] There is also a difference in decoder architecture
and different amounts of hardware and software
solutions that make the use of raw processing power
and memory access measures unreliable to indicate the
complexity requirements. However, if the operations
required are indicated by parameters that count the
number of block decoding operations necessary, the
decoder can estimate the complexity. This is because
the decoder knows the amount of operations required
for each of the block decoding operations in its own
implementation.
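
The decoder-side estimate described above amounts to weighting each transmitted block count by that decoder's own cost for the operation. The sketch below uses invented counts and per-block costs purely for illustration; the names and the numbers are hypothetical and do not come from this document.

```python
def estimate_cost(block_counts, cost_per_block):
    """Weight each block-operation count by this decoder's cost per block."""
    return sum(count * cost_per_block[name]
               for name, count in block_counts.items())


# Block decoding operation counts, as they might be read from the stream.
block_counts = {
    "entropy_decoding": 396,
    "inverse_quantization": 300,
    "inverse_dct": 300,
    "motion_compensation": 250,
}

# Hypothetical per-block costs (arbitrary units) for two implementations.
software_decoder = {
    "entropy_decoding": 400,
    "inverse_quantization": 100,
    "inverse_dct": 900,
    "motion_compensation": 300,
}
hardware_assisted_decoder = {
    "entropy_decoding": 400,
    "inverse_quantization": 10,
    "inverse_dct": 20,
    "motion_compensation": 300,
}

software_estimate = estimate_cost(block_counts, software_decoder)
hardware_estimate = estimate_cost(block_counts, hardware_assisted_decoder)
```

The same transmitted counts yield different estimates on the two decoders, which is the sense in which the counts themselves are implementation independent.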

[0045] In the embodiment of the System Encoder and
Multiplexer, the elementary bitstreams are packetized
and multiplexed for transmission. The information
associated with the complexity parameters is also
multiplexed into the bitstream. This information is inserted
into the header of the packets. Decoders that do not
require such information may simply skip over this
information. Decoders that require such information can
decode this information and interpret it to estimate
the complexity requirements.

[0046] In this embodiment the encoder inserts the
information in the form of a descriptor in the header of
the packet. The descriptor contains an ID to indicate the
type of descriptor, followed by the total number of
bytes contained in the descriptor. The rest of the
descriptor contains the parameter for each of the block
decoding operations. Optionally the descriptor may also
carry some user defined parameters that are not
defined earlier.
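
A possible byte layout for such a descriptor is sketched below. The field widths (one byte for the ID, one byte for the length, 16 bits per parameter), the ID value, and the convention that the length counts the bytes following the length field are all assumptions made for illustration; the document does not specify them. The six parameter names follow the block decoding operations listed in paragraph [0018].

```python
import struct

PARAMETER_NAMES = (
    "entropy_decoding",
    "motion_compensation",
    "inverse_quantization",
    "transform",
    "addition",
    "memory_access",
)
DESCRIPTOR_ID = 0x80  # hypothetical identifier value


def pack_descriptor(counts):
    """Serialise the block decoding parameters as ID, length, payload."""
    payload = struct.pack(">6H", *(counts[name] for name in PARAMETER_NAMES))
    return struct.pack(">BB", DESCRIPTOR_ID, len(payload)) + payload


def unpack_descriptor(data):
    """Return (parameters, bytes_consumed).

    parameters is None when the descriptor type is not recognised; the
    length field still tells the decoder how many bytes to skip.
    """
    tag, length = struct.unpack_from(">BB", data, 0)
    consumed = 2 + length
    if tag != DESCRIPTOR_ID:
        return None, consumed
    values = struct.unpack_from(">6H", data, 2)
    return dict(zip(PARAMETER_NAMES, values)), consumed
```

The length field is what lets a decoder that does not use the complexity information step over the descriptor without parsing its contents.
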
[0047] In the Scheduler and Complexity Estimator 180
in Figure 1, the time it takes for decoding all the
audiovisual objects is computed based on the parameters
found in the descriptor as well as the feedback
information from the decoder.
[0048] An embodiment of the Complexity Estimator
180 is shown in Figure 6. The block decoding operation
parameters 181a, 181b and 181c are passed into the
complexity estimator 183 after being pre-multiplied with
weightings 182a, 182b and 182c, respectively. The
complexity estimator 183 then estimates the complexity
of the picture to be decoded and passes the estimated
complexity 184 to the decoder 190. After decoding the
picture the decoder 190 returns the actual complexity
185 of the picture. An error 186 in the complexity
estimation is obtained by taking a difference between the
estimated complexity 184 and the actual complexity 185
of the picture. The error 186 is then passed into the
feedback gain unit 187 where the corrections 188a,
188b and 188c to the weightings are found. The weights
are then modified by these corrections and the process
of estimating the complexity of the next picture
continues.
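
The feedback loop of Figure 6 can be sketched as follows. The document does not say how the feedback gain unit derives the corrections, so this sketch assumes a normalised least-mean-squares style update as one plausible choice; the class and method names are hypothetical.

```python
class ComplexityEstimator:
    """Weighted sum of block-operation counts with error feedback."""

    def __init__(self, weights, gain):
        self.weights = list(weights)
        self.gain = gain  # feedback gain, assumed to lie in (0, 1]
        self._last_parameters = None

    def estimate(self, parameters):
        """Estimated complexity: parameters pre-multiplied by the weights."""
        self._last_parameters = list(parameters)
        return sum(w * p for w, p in zip(self.weights, parameters))

    def feedback(self, estimated, actual):
        """Correct the weights from the estimation error and return it."""
        error = actual - estimated
        # Normalising by the squared parameter magnitude is an assumption
        # of this sketch, used to keep the step size scale independent.
        norm = sum(p * p for p in self._last_parameters) or 1
        self.weights = [
            w + self.gain * error * p / norm
            for w, p in zip(self.weights, self._last_parameters)
        ]
        return error
```

With a gain of 0.5, one feedback step moves the next estimate for the same parameters halfway from the previous estimate towards the actual complexity reported by the decoder.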

[0049] The effect of this invention is that
implementations that can handle the worst case are no
longer necessary. Using the indications of computational
complexities and the computational graceful
degradation methods, simpler decoders can be
implemented. The decoder would have the capabilities
to decode most of the sequences, but if it encounters
some more demanding sequences, it can degrade the
quality and resolution of the decoder output in order to
decode the bitstream.

[0050] This invention is also useful for interworking of
services that have different resolutions and/or different
formats. The sub-region can be decoded by the decoder
of lower resolutions whereas the decoder of higher
resolutions can decode the full picture. One example is the
interworking between 16:9 and 4:3 aspect ratio decoders.

Claims 

1. A method for encoding a plurality of audiovisual
objects into a compressed coded representation
suitable for computational graceful degradation at
the decoder comprising:

encoding said audiovisual objects, incorporating
methods allowing computational graceful
degradation to be employed in the decoder,
into their coded representations;

estimating the implementation independent
computational complexity measures in terms of
a plurality of block decoding parameters;

partitioning said coded representations of the
audiovisual objects into a plurality of access
units and adding header information to form
packets;

inserting a descriptor containing said block
decoding parameters into the header of the
packet; and

multiplexing these packets to form a single
multiplexed bitstream.

2. A method of decoding a multiplexed bitstream, with
computational graceful degradation, to obtain a
plurality of audiovisual objects, comprising:

de-multiplexing the single multiplexed bitstream
into a plurality of packets comprising
packet headers and access units;

extracting the descriptor containing a plurality
of block decoding parameters from the packet
headers;

reassembling the access units into their original
coded representations of the audiovisual
objects;

estimating the decoder specific computational
complexity measures based on said block
decoding parameters and the current decoder
implementation; and

decoding said coded representations of the
audiovisual objects, using computational
graceful degradation, where necessary, to satisfy
the estimated decoder specific computational
complexity requirements.

3. A method of encoding said audiovisual objects into their
compressed coded representations, according to claim 1,
whereby incorporating methods allowing computational graceful
degradation to be employed in the decoder further comprises:

partitioning the input pictures to be encoded into a plurality
of sub-regions numbered in increasing order, beginning with
the full picture as the first sub-region, where each
sub-region comprises only a subset of the pixels within the
sub-region preceding it;

entropy coding the position and dimension of the sub-regions
into a compressed coded representation within the bitstream;

further partitioning the sub-regions into a plurality of
blocks for encoding into a compressed coded representation
within the bitstream;

performing motion estimation and motion compensation for said
blocks using only the pixels from the reconstructed picture
that belong to sub-regions having the same or higher numeric
order as said blocks;

entropy coding the motion vectors into a compressed coded
representation within the bitstream;

transforming the motion compensated prediction difference into
an orthogonal domain;

quantizing the transformed coefficients using a quantization
method; and

entropy coding the quantized transformed coefficients into a
compressed coded representation within the bitstream.
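
As an illustration only, the nesting constraint on the sub-regions
described in claim 3 can be sketched in Python. The sketch assumes
rectangular sub-regions represented as (x, y, width, height) tuples;
that representation and the function names are hypothetical and are
not taken from the claims.

```python
from typing import List, Tuple

# Assumed representation of a sub-region: (x, y, width, height) in pixels.
Rect = Tuple[int, int, int, int]


def contains(outer: Rect, inner: Rect) -> bool:
    """Return True if every pixel of `inner` also lies inside `outer`."""
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return (ox <= ix and oy <= iy
            and ix + iw <= ox + ow
            and iy + ih <= oy + oh)


def valid_sub_regions(width: int, height: int, regions: List[Rect]) -> bool:
    """Check that the first sub-region is the full picture and that each
    later sub-region holds only pixels of the sub-region preceding it."""
    if not regions or regions[0] != (0, 0, width, height):
        return False
    return all(contains(regions[i - 1], regions[i])
               for i in range(1, len(regions)))
```

With a 64x64 picture, the list [(0, 0, 64, 64), (16, 16, 32, 32)]
passes the check, while a second sub-region that extends past the
picture edge does not.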

4. A method for decoding said coded representations of the
audiovisual objects according to claim 2, using computational
graceful degradation, where necessary, to satisfy the
estimated decoder specific computational complexity
requirements, further comprising:

entropy decoding the position and dimension of the sub-regions
from the compressed coded representation within the bitstream;

selecting only the blocks that are within the sub-region of
interest for decoding;

entropy decoding the compressed coded representation to give
quantized transformed coefficients;

inverse quantizing said quantized transformed coefficients to
give the transformed coefficients;

inverse transforming said transform coefficients to give the
spatial domain motion compensated prediction difference;

entropy decoding the motion vectors from the compressed coded
representation within the bitstream;

performing motion compensation for said blocks using only the
pixels from the reconstructed picture that belong to
sub-regions having the same or higher numeric order as said
blocks; and

reconstructing the picture and storing said picture in the
frame memory for prediction of the next picture.

5. A method of encoding said audiovisual objects into their
coded representations, according to claim 1, whereby
incorporating methods allowing computational graceful
degradation to be employed in the decoder further comprises:

partitioning the input pictures to be encoded into a plurality
of sub-regions numbered in increasing order, beginning with
the full picture as the first sub-region, where each
sub-region comprises only a subset of the pixels within the
sub-region preceding it;

entropy coding the position and dimension of the sub-regions
into a compressed coded representation within the bitstream;

further partitioning the sub-regions into a plurality of
blocks for encoding into a compressed coded representation
within the bitstream;

transforming said blocks into an orthogonal domain;

quantizing the transformed coefficients using a quantization
method;

performing quantized transform coefficient prediction for said
blocks using only the corresponding quantized transform
coefficients from the blocks above and to the left that belong
to sub-regions having the same or higher numeric order as said
blocks; and

entropy coding the predicted difference of the quantized
transformed coefficients into a compressed coded
representation within the bitstream.

6. A method for decoding said coded representations of the
audiovisual objects according to claim 2, using computational
graceful degradation, where necessary, to satisfy the
estimated decoder specific computational complexity
requirements, further comprising:

entropy decoding the position and dimension of the sub-regions
from the compressed coded representation within the bitstream;

selecting only the blocks that are within the sub-region of
interest for decoding;

entropy decoding the compressed coded representation to give
quantized transformed coefficients;

performing quantized transform coefficient prediction for said
blocks using only the corresponding quantized transform
coefficients from the blocks above and to the left that belong
to sub-regions having the same or higher numeric order as said
blocks;

inverse quantizing said quantized transformed coefficients to
give the transformed coefficients;

inverse transforming said transform coefficients to give the
spatial domain pixel values; and

reconstructing the picture and storing said picture in the
frame memory for prediction of the next picture.
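
The restricted coefficient prediction recited above can be
illustrated with a simplified Python sketch. It assumes one quantized
coefficient per block (for example the DC term), a per-block table
giving the numeric order of the sub-region each block belongs to, and
a fixed rule that prefers the left neighbour over the one above. The
selection rule, the default predictor of 0 and all names are
assumptions made for illustration; the claims only require that the
neighbours used belong to sub-regions of the same or higher order.

```python
from typing import List


def predict_coefficient(coeff: List[List[int]], order: List[List[int]],
                        row: int, col: int, default: int = 0) -> int:
    """Predict the coefficient of block (row, col) from the left or the
    above neighbour, using a neighbour only if its sub-region order is
    the same as or higher than that of the current block."""
    current = order[row][col]
    # Left neighbour is tried first (an assumed preference).
    if col > 0 and order[row][col - 1] >= current:
        return coeff[row][col - 1]
    # Otherwise fall back to the neighbour above, if it is usable.
    if row > 0 and order[row - 1][col] >= current:
        return coeff[row - 1][col]
    # No usable neighbour: use a fixed default predictor.
    return default


def prediction_residual(coeff: List[List[int]], order: List[List[int]],
                        row: int, col: int) -> int:
    """Value that would be entropy coded: coefficient minus predictor."""
    return coeff[row][col] - predict_coefficient(coeff, order, row, col)
```

Because a block in a higher-order sub-region never depends on a
neighbour of lower order, a decoder that skips the lower-order blocks
can still rebuild the coefficients of the blocks it keeps.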



7. A method for estimating the implementation independent
computational complexity measures, as in claim 1, whereby the
plurality of block decoding parameters comprises numbers
indicating the number of:

block entropy decoding operations;
block motion compensation operations;
block inverse quantization operations;
block transform operations;
block addition operations; and
block memory access operations.
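
One way a decoder might turn these operation counts into a decoder
specific complexity estimate is a weighted sum, sketched below in
Python. The per-operation costs, the budget comparison and all
identifiers are hypothetical; the claims do not prescribe a
particular formula.

```python
from typing import Dict

# Names for the six block decoding parameters listed in the claim.
OPERATIONS = (
    "entropy_decoding",
    "motion_compensation",
    "inverse_quantization",
    "transform",
    "addition",
    "memory_access",
)


def estimate_cost(block_counts: Dict[str, int],
                  cost_per_operation: Dict[str, float]) -> float:
    """Weighted sum of the transmitted block operation counts, where the
    weights are costs measured for one specific decoder implementation."""
    return sum(block_counts[name] * cost_per_operation[name]
               for name in OPERATIONS)


def needs_degradation(estimated_cost: float, budget: float) -> bool:
    """True when the estimate exceeds what the decoder can afford."""
    return estimated_cost > budget
```

The counts are the same for every decoder, so only the cost table
changes from one implementation to another.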

8. A method of encoding the block decoding parameters in the
header of the packet as in claim 1, where the descriptor
comprises:

a descriptor identification number signaling the descriptor
type;

a descriptor length field to indicate the size of the
descriptor; and

a plurality of block decoding parameters.
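
A descriptor with this three-part layout could be serialized as
sketched below in Python. The field widths (an 8-bit identification
number, an 8-bit length and six 32-bit big-endian counts), the tag
value 0x80 and the parameter names are assumptions chosen for the
example; the claims do not fix any of them.

```python
import struct
from typing import Dict

COMPLEXITY_DESCRIPTOR_ID = 0x80  # hypothetical descriptor type value
PARAM_NAMES = (
    "entropy_decoding",
    "motion_compensation",
    "inverse_quantization",
    "transform",
    "addition",
    "memory_access",
)
PAYLOAD_FORMAT = ">6I"  # assumed: six unsigned 32-bit big-endian counts


def pack_descriptor(params: Dict[str, int]) -> bytes:
    """Build identification number, length field and parameter payload."""
    payload = struct.pack(PAYLOAD_FORMAT, *(params[n] for n in PARAM_NAMES))
    header = struct.pack(">BB", COMPLEXITY_DESCRIPTOR_ID, len(payload))
    return header + payload


def unpack_descriptor(data: bytes) -> Dict[str, int]:
    """Parse a descriptor produced by pack_descriptor."""
    tag, length = struct.unpack_from(">BB", data, 0)
    if tag != COMPLEXITY_DESCRIPTOR_ID:
        raise ValueError("not a complexity descriptor")
    if length != struct.calcsize(PAYLOAD_FORMAT) or len(data) < 2 + length:
        raise ValueError("unexpected descriptor length")
    values = struct.unpack_from(PAYLOAD_FORMAT, data, 2)
    return dict(zip(PARAM_NAMES, values))
```

The length field lets a decoder that does not recognize the
identification number skip the descriptor without parsing it.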

9. A method of partitioning the input pictures to be encoded
into a plurality of sub-regions according to claims 3 and 5,
where said sub-regions are rectangular.






EP 0 912 063 A2 



10. A method of performing motion estimation and motion
compensation for said blocks according to claim 3, whereby
using only the pixels from the reconstructed picture that
belong to sub-regions having the same or higher numeric order
as said blocks implies that only prediction blocks that lie
completely within said sub-regions are selected.
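
The restriction in claim 10 amounts to a containment test on each
candidate prediction block. The Python sketch below assumes a
rectangular sub-region, integer-pel motion vectors and square blocks;
the names and the 16-pixel default block size are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Assumed sub-region description: top-left corner and size in pixels."""
    x: int
    y: int
    w: int
    h: int


def block_inside(region: Rect, x: int, y: int, size: int) -> bool:
    """True if the size x size block at (x, y) lies completely in region."""
    return (region.x <= x and region.y <= y
            and x + size <= region.x + region.w
            and y + size <= region.y + region.h)


def motion_vector_allowed(region: Rect, bx: int, by: int,
                          mvx: int, mvy: int, size: int = 16) -> bool:
    """A candidate vector is allowed only if the prediction block it points
    to lies completely within the sub-region of the block being coded."""
    return block_inside(region, bx + mvx, by + mvy, size)
```

An encoder could apply this test to every candidate during motion
search and discard those that fail.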

11. A method of performing motion estimation and motion
compensation for said blocks according to claim 3, whereby
using only the pixels from the reconstructed picture that
belong to sub-regions having the same or higher numeric order
as said blocks implies that prediction blocks may lie
partially outside said sub-regions but with the additional
condition that the pixels lying outside said sub-region are
replaced by the nearest pixels from within the sub-regions.
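
For a rectangular sub-region, replacing outside pixels by the nearest
pixels inside can be done by clamping coordinates, as in the Python
sketch below. The picture is assumed to be a list of rows of sample
values and the sub-region an (x, y, width, height) tuple; both the
representation and the function name are hypothetical.

```python
from typing import List, Tuple


def fetch_padded_block(picture: List[List[int]],
                       region: Tuple[int, int, int, int],
                       x: int, y: int, size: int) -> List[List[int]]:
    """Read a size x size prediction block at (x, y). Any sample position
    outside the sub-region is replaced by the nearest sample inside it,
    obtained by clamping the coordinates to the region boundary."""
    x0, y0, w, h = region
    block = []
    for j in range(y, y + size):
        cy = min(max(j, y0), y0 + h - 1)      # clamp row to the region
        row = []
        for i in range(x, x + size):
            cx = min(max(i, x0), x0 + w - 1)  # clamp column to the region
            row.append(picture[cy][cx])
        block.append(row)
    return block
```

Since only samples inside the sub-region are read, a decoder that has
reconstructed nothing outside it still obtains the same prediction as
the encoder.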

12. A method of partitioning the pictures into a plurality of
sub-regions according to claims 3 and 5, where the position
and dimension of each of said sub-regions may vary from
picture to picture and said position and said dimension are
coded by means of a pan scan vector, giving the horizontal and
vertical displacement, a width and a height.

13. A method of partitioning the pictures into a plurality of
sub-regions according to claims 3 and 5, where the position
and dimension of the sub-regions are the same from picture to
picture and said position and said dimension are coded once at
the beginning of the sequence by means of a horizontal and
vertical displacement, a width and a height.


14. A method of encoding and decoding according to claims 3,
4, 5 and 6, where the transform is the Discrete Cosine
Transform.

15. A method of encoding and decoding according to claims 3,
4, 5 and 6, where the number of sub-regions is two.

16. A method where there is a plurality of sub-regions
numbered in increasing order and the motion vector can point
into a sub-region of lower order but not out of a lower order
to a higher ordered number.